A Laplacian Eigenmaps Based Semantic Similarity Measure between Words

Authors

  • Yuming Wu
  • Cungen Cao
  • Shi Wang
  • Dongsheng Wang
Abstract

Measuring semantic similarity between words is important in many applications. In this paper, we propose a method based on Laplacian eigenmaps to measure semantic similarity between words. First, we attach semantic features to each word. Second, we calculate a similarity matrix, into which the semantic features are encoded, in the original high-dimensional space. Finally, with the aid of Laplacian eigenmaps, we recalculate the similarities in the target low-dimensional space. An experiment on the Miller-Charles benchmark shows that the similarity measurement in the low-dimensional space achieves a correlation coefficient of 0.812, compared with 0.683 in the high-dimensional space, a relative improvement of 18.9%.
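The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy words and binary features, the cosine similarity used for the high-dimensional matrix, the normalized-Laplacian formulation, and the distance-based similarity in the embedded space are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X (assumed measure)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)  # guard against all-zero rows
    return Xn @ Xn.T


def laplacian_eigenmaps(W, dim):
    """Embed n items into `dim` dimensions from a symmetric, non-negative
    similarity matrix W by solving L y = lambda * D y, with L = D - W."""
    W = np.array(W, dtype=float)      # work on a copy
    np.fill_diagonal(W, 0.0)          # ignore self-similarity
    d = W.sum(axis=1)                 # node degrees
    if np.any(d <= 0):
        raise ValueError("every word needs at least one positive similarity")
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * U       # map back to generalized eigenvectors
    return Y[:, 1:dim + 1]            # skip the trivial constant eigenvector


def embedding_similarity(Y):
    """Similarity in the embedded space as 1 / (1 + Euclidean distance).
    This choice is an assumption, not taken from the paper."""
    diff = Y[:, None, :] - Y[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return 1.0 / (1.0 + dist)


# Hypothetical toy data: rows are words, columns are made-up binary features
# (vehicle, wheels, animal, wings, machine).
words = ["car", "automobile", "bird", "crane"]
X = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
], dtype=float)

S_high = cosine_similarity_matrix(X)    # step 2: high-dimensional similarities
Y = laplacian_eigenmaps(S_high, dim=1)  # step 3: low-dimensional embedding
S_low = embedding_similarity(Y)         # similarities recalculated in the embedding

print(np.round(S_high, 3))
print(np.round(S_low, 3))
```

The sketch assumes the similarity graph is connected, so that only the first eigenvector is trivial. On real data the target dimension and the similarity used in the embedded space would need to be tuned against a benchmark such as Miller-Charles.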


Similar resources

Coloring of DT-MRI Fiber Traces Using Laplacian Eigenmaps

We propose a novel post-processing method for visualization of fiber traces from DT-MRI data. Using a recently proposed non-linear dimensionality reduction technique, Laplacian eigenmaps [3], we create a mapping from a set of fiber traces to a low-dimensional Euclidean space. Laplacian eigenmaps constructs this mapping so that similar traces are mapped to similar points, given a custom made pai...


Broadcast News Story Segmentation Using Probabilistic Latent Semantic Analysis and Laplacian Eigenmaps

This paper proposes to integrate probabilistic latent semantic analysis (PLSA) and Laplacian Eigenmaps (LE) for broadcast news story segmentation. PLSA can address synonymy and polysemy problems by exploring underlying semantic relations beneath the actual occurrences of words. LE can provide a data transformation with the advantage of preserving the original temporal structure of sentence cohe...


Supervised Laplacian Eigenmaps with Applications in Clinical Diagnostics for Pediatric Cardiology

Electronic health records contain rich textual data that carry critical predictive information for machine-learning-based diagnostic aids. However, many traditional machine learning methods fail to integrate vector space data and text simultaneously. We present a supervised method using Laplacian eigenmaps to augment existing machine-learning methods with low-dimensional representations o...


A WordNet-based Semantic Similarity Measure Enhanced by Internet-based Knowledge

Approaches for measuring semantic similarity between words have been widely employed in various areas such as Artificial Intelligence, Linguistics, Cognitive Science and Knowledge Engineering. A new semantic similarity measure is proposed in this paper, which exploits the knowledge retrieved from a semantic network (i.e., WordNet) and the Internet. In particular, the structure information from ...


Algorithm for Semantic Based Similarity Measure

A document representation model, the Semantic-Based Similarity Measure (SBSM), is proposed. This model combines phrase analysis and word analysis, using PropBank notation as background knowledge, to explore better ways of representing documents for clustering. The SBSM assigns semantic weights to both document words and phrases. The new weights reflect the semantic relatedne...




Publication date: 2010